Overview

Dataset statistics

Number of variables: 20
Number of observations: 203
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 31.8 KiB
Average record size in memory: 160.6 B
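
The percentages in this table follow directly from the counts. The sketch below re-derives them, assuming the missing-cell share is taken over all cells (variables times observations) and the duplicate share over all rows. The variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 20
n_observations = 203
missing_cells = 2
duplicate_rows = 1
total_size_kib = 31.8

# Assumption: missing cells are counted against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are counted against all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

# Rough bytes per row. This will not match the reported 160.6 B exactly,
# because 31.8 KiB is itself a rounded figure.
approx_record_size = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"approx. record size: {approx_record_size:.1f} B")
```

The missing-cell share comes out well under 0.1%, which is why the report shows it only as "< 0.1%".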

Variable types

Categorical: 9
Numeric: 11

Alerts

Dataset has 1 (0.5%) duplicate row (Duplicates)
wheel_base is highly correlated with length and 7 other fields (High correlation)
length is highly correlated with wheel_base and 8 other fields (High correlation)
width is highly correlated with wheel_base and 7 other fields (High correlation)
height is highly correlated with wheel_base and 1 other field (High correlation)
curb_weight is highly correlated with wheel_base and 7 other fields (High correlation)
engine_size is highly correlated with wheel_base and 7 other fields (High correlation)
horsepower is highly correlated with wheel_base and 7 other fields (High correlation)
city_mpg is highly correlated with length and 6 other fields (High correlation)
highway_mpg is highly correlated with wheel_base and 7 other fields (High correlation)
price is highly correlated with wheel_base and 7 other fields (High correlation)
wheel_base is highly correlated with length and 6 other fields (High correlation)
length is highly correlated with wheel_base and 7 other fields (High correlation)
width is highly correlated with wheel_base and 7 other fields (High correlation)
height is highly correlated with wheel_base (High correlation)
curb_weight is highly correlated with wheel_base and 7 other fields (High correlation)
engine_size is highly correlated with wheel_base and 7 other fields (High correlation)
horsepower is highly correlated with length and 6 other fields (High correlation)
city_mpg is highly correlated with length and 6 other fields (High correlation)
highway_mpg is highly correlated with wheel_base and 7 other fields (High correlation)
price is highly correlated with wheel_base and 7 other fields (High correlation)
wheel_base is highly correlated with length and 4 other fields (High correlation)
length is highly correlated with wheel_base and 6 other fields (High correlation)
width is highly correlated with wheel_base and 7 other fields (High correlation)
curb_weight is highly correlated with wheel_base and 7 other fields (High correlation)
engine_size is highly correlated with wheel_base and 7 other fields (High correlation)
horsepower is highly correlated with width and 5 other fields (High correlation)
city_mpg is highly correlated with length and 6 other fields (High correlation)
highway_mpg is highly correlated with length and 6 other fields (High correlation)
price is highly correlated with wheel_base and 7 other fields (High correlation)
num_of_cylinders is highly correlated with engine_type (High correlation)
body_style is highly correlated with num_of_doors (High correlation)
make is highly correlated with engine_type and 2 other fields (High correlation)
num_of_doors is highly correlated with body_style (High correlation)
engine_type is highly correlated with num_of_cylinders and 1 other field (High correlation)
fuel_system is highly correlated with make and 1 other field (High correlation)
fuel_type is highly correlated with fuel_system (High correlation)
engine_location is highly correlated with make (High correlation)
make is highly correlated with body_style and 16 other fields (High correlation)
fuel_type is highly correlated with fuel_system and 1 other field (High correlation)
num_of_doors is highly correlated with body_style and 2 other fields (High correlation)
body_style is highly correlated with make and 4 other fields (High correlation)
drive_wheels is highly correlated with make and 8 other fields (High correlation)
engine_location is highly correlated with make and 4 other fields (High correlation)
wheel_base is highly correlated with make and 15 other fields (High correlation)
length is highly correlated with make and 13 other fields (High correlation)
width is highly correlated with make and 11 other fields (High correlation)
height is highly correlated with make and 11 other fields (High correlation)
curb_weight is highly correlated with make and 13 other fields (High correlation)
engine_type is highly correlated with make and 11 other fields (High correlation)
num_of_cylinders is highly correlated with make and 12 other fields (High correlation)
engine_size is highly correlated with make and 14 other fields (High correlation)
fuel_system is highly correlated with make and 13 other fields (High correlation)
compression_ratio is highly correlated with make and 2 other fields (High correlation)
horsepower is highly correlated with make and 12 other fields (High correlation)
city_mpg is highly correlated with make and 10 other fields (High correlation)
highway_mpg is highly correlated with make and 13 other fields (High correlation)
price is highly correlated with make and 13 other fields (High correlation)
price has 4 (2.0%) zeros (Zeros)

Reproduction

Analysis started: 2022-07-26 22:05:26.568928
Analysis finished: 2022-07-26 22:05:50.342819
Duration: 23.77 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

make
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 21
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
toyota | 32
nissan | 18
mazda | 17
honda | 13
mitsubishi | 13
Other values (16) | 110

Length

Max length: 13
Median length: 11
Mean length: 6.472906404
Min length: 3

Characters and Unicode

Total characters: 1314
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: alfa-romero
2nd row: alfa-romero
3rd row: alfa-romero
4th row: audi
5th row: audi

Common Values

Value | Count | Frequency (%)
toyota | 32 | 15.8%
nissan | 18 | 8.9%
mazda | 17 | 8.4%
honda | 13 | 6.4%
mitsubishi | 13 | 6.4%
subaru | 12 | 5.9%
volkswagen | 12 | 5.9%
peugot | 11 | 5.4%
volvo | 11 | 5.4%
dodge | 9 | 4.4%
Other values (11) | 55 | 27.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
a | 152 | 11.6%
o | 152 | 11.6%
s | 109 | 8.3%
t | 98 | 7.5%
e | 79 | 6.0%
u | 74 | 5.6%
n | 69 | 5.3%
i | 68 | 5.2%
d | 63 | 4.8%
m | 57 | 4.3%
Other values (15) | 393 | 29.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1303 | 99.2%
Dash Punctuation | 11 | 0.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 152 | 11.7%
o | 152 | 11.7%
s | 109 | 8.4%
t | 98 | 7.5%
e | 79 | 6.1%
u | 74 | 5.7%
n | 69 | 5.3%
i | 68 | 5.2%
d | 63 | 4.8%
m | 57 | 4.4%
Other values (14) | 382 | 29.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1303 | 99.2%
Common | 11 | 0.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 152 | 11.7%
o | 152 | 11.7%
s | 109 | 8.4%
t | 98 | 7.5%
e | 79 | 6.1%
u | 74 | 5.7%
n | 69 | 5.3%
i | 68 | 5.2%
d | 63 | 4.8%
m | 57 | 4.4%
Other values (14) | 382 | 29.3%

Common
Value | Count | Frequency (%)
- | 11 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1314 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 152 | 11.6%
o | 152 | 11.6%
s | 109 | 8.3%
t | 98 | 7.5%
e | 79 | 6.0%
u | 74 | 5.6%
n | 69 | 5.3%
i | 68 | 5.2%
d | 63 | 4.8%
m | 57 | 4.3%
Other values (15) | 393 | 29.9%

fuel_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
gas | 183
diesel | 20

Length

Max length: 6
Median length: 3
Mean length: 3.295566502
Min length: 3

Characters and Unicode

Total characters: 669
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gas
2nd row: gas
3rd row: gas
4th row: gas
5th row: gas

Common Values

Value | Count | Frequency (%)
gas | 183 | 90.1%
diesel | 20 | 9.9%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
s | 203 | 30.3%
g | 183 | 27.4%
a | 183 | 27.4%
e | 40 | 6.0%
d | 20 | 3.0%
i | 20 | 3.0%
l | 20 | 3.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 669 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
s | 203 | 30.3%
g | 183 | 27.4%
a | 183 | 27.4%
e | 40 | 6.0%
d | 20 | 3.0%
i | 20 | 3.0%
l | 20 | 3.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 669 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
s | 203 | 30.3%
g | 183 | 27.4%
a | 183 | 27.4%
e | 40 | 6.0%
d | 20 | 3.0%
i | 20 | 3.0%
l | 20 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 669 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
s | 203 | 30.3%
g | 183 | 27.4%
a | 183 | 27.4%
e | 40 | 6.0%
d | 20 | 3.0%
i | 20 | 3.0%
l | 20 | 3.0%

num_of_doors
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 1.0%
Missing: 2
Missing (%): 1.0%
Memory size: 1.7 KiB

Value | Count
four | 113
two | 88

Length

Max length: 4
Median length: 4
Mean length: 3.562189055
Min length: 3

Characters and Unicode

Total characters: 716
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: two
2nd row: two
3rd row: two
4th row: four
5th row: four

Common Values

Value | Count | Frequency (%)
four | 113 | 55.7%
two | 88 | 43.3%
(Missing) | 2 | 1.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
four | 113 | 56.2%
two | 88 | 43.8%
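
The two frequency tables for num_of_doors show different percentages for the same counts. A plausible reading, offered here as an assumption rather than a statement about the tool's internals, is that the first table divides by all 203 rows (missing included) while the second divides by the 201 non-missing rows. The sketch below reproduces both sets of figures under that assumption; the `shares` helper is hypothetical.

```python
# Counts copied from the num_of_doors tables.
counts = {"four": 113, "two": 88}
n_missing = 2


def shares(value_counts, denominator):
    """Return each count as a percentage of `denominator`, rounded to 1 decimal."""
    return {k: round(100 * v / denominator, 1) for k, v in value_counts.items()}


n_present = sum(counts.values())  # rows with a value
n_total = n_present + n_missing   # all rows

# Assumption: the "Common Values" table uses all rows as the denominator.
with_missing = shares({**counts, "(Missing)": n_missing}, n_total)

# Assumption: the second table uses only non-missing rows.
without_missing = shares(counts, n_present)

print(with_missing)
print(without_missing)
```

Both dictionaries match the percentages printed in the corresponding tables, which supports the reading but does not prove it.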

Most occurring characters

Value | Count | Frequency (%)
o | 201 | 28.1%
f | 113 | 15.8%
u | 113 | 15.8%
r | 113 | 15.8%
t | 88 | 12.3%
w | 88 | 12.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 716 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 201 | 28.1%
f | 113 | 15.8%
u | 113 | 15.8%
r | 113 | 15.8%
t | 88 | 12.3%
w | 88 | 12.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 716 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 201 | 28.1%
f | 113 | 15.8%
u | 113 | 15.8%
r | 113 | 15.8%
t | 88 | 12.3%
w | 88 | 12.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 716 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 201 | 28.1%
f | 113 | 15.8%
u | 113 | 15.8%
r | 113 | 15.8%
t | 88 | 12.3%
w | 88 | 12.3%

body_style
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
sedan | 96
hatchback | 69
wagon | 24
hardtop | 8
convertible | 6

Length

Max length: 11
Median length: 5
Mean length: 6.615763547
Min length: 5

Characters and Unicode

Total characters: 1343
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: convertible
2nd row: convertible
3rd row: hatchback
4th row: sedan
5th row: sedan

Common Values

Value | Count | Frequency (%)
sedan | 96 | 47.3%
hatchback | 69 | 34.0%
wagon | 24 | 11.8%
hardtop | 8 | 3.9%
convertible | 6 | 3.0%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
a | 266 | 19.8%
h | 146 | 10.9%
c | 144 | 10.7%
n | 126 | 9.4%
e | 108 | 8.0%
d | 104 | 7.7%
s | 96 | 7.1%
t | 83 | 6.2%
b | 75 | 5.6%
k | 69 | 5.1%
Other values (8) | 126 | 9.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1343 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 266 | 19.8%
h | 146 | 10.9%
c | 144 | 10.7%
n | 126 | 9.4%
e | 108 | 8.0%
d | 104 | 7.7%
s | 96 | 7.1%
t | 83 | 6.2%
b | 75 | 5.6%
k | 69 | 5.1%
Other values (8) | 126 | 9.4%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1343 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 266 | 19.8%
h | 146 | 10.9%
c | 144 | 10.7%
n | 126 | 9.4%
e | 108 | 8.0%
d | 104 | 7.7%
s | 96 | 7.1%
t | 83 | 6.2%
b | 75 | 5.6%
k | 69 | 5.1%
Other values (8) | 126 | 9.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1343 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 266 | 19.8%
h | 146 | 10.9%
c | 144 | 10.7%
n | 126 | 9.4%
e | 108 | 8.0%
d | 104 | 7.7%
s | 96 | 7.1%
t | 83 | 6.2%
b | 75 | 5.6%
k | 69 | 5.1%
Other values (8) | 126 | 9.4%

drive_wheels
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
fwd | 118
rwd | 76
"4wd " | 5
"4wd" | 4

Length

Max length: 4
Median length: 3
Mean length: 3.024630542
Min length: 3

Characters and Unicode

Total characters: 614
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: rwd
2nd row: rwd
3rd row: rwd
4th row: fwd
5th row: 4wd

Common Values

Value | Count | Frequency (%)
fwd | 118 | 58.1%
rwd | 76 | 37.4%
"4wd " | 5 | 2.5%
"4wd" | 4 | 2.0%
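
The table lists two 4wd categories, one of which carries a trailing space (consistent with the maximum length of 4 and the five Space Separator characters reported for this variable). A minimal sketch of how such variants could be merged before analysis, using only the Python standard library. The list `raw` is rebuilt from the counts above and is not the original data.

```python
from collections import Counter

# Rebuild the column from the reported counts; "4wd " has a trailing space.
raw = ["fwd"] * 118 + ["rwd"] * 76 + ["4wd "] * 5 + ["4wd"] * 4

raw_counts = Counter(raw)

# Stripping surrounding whitespace merges the two 4wd variants.
clean_counts = Counter(value.strip() for value in raw)

print(len(raw_counts), "distinct values before cleaning")
print(len(clean_counts), "distinct values after cleaning")
print("4wd rows after cleaning:", clean_counts["4wd"])
```

After stripping, the combined 4wd count agrees with the single 4wd row (9) in the category frequency table for this variable.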

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
fwd | 118 | 58.1%
rwd | 76 | 37.4%
4wd | 9 | 4.4%

Most occurring characters

Value | Count | Frequency (%)
w | 203 | 33.1%
d | 203 | 33.1%
f | 118 | 19.2%
r | 76 | 12.4%
4 | 9 | 1.5%
(space) | 5 | 0.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 600 | 97.7%
Decimal Number | 9 | 1.5%
Space Separator | 5 | 0.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
w | 203 | 33.8%
d | 203 | 33.8%
f | 118 | 19.7%
r | 76 | 12.7%

Decimal Number
Value | Count | Frequency (%)
4 | 9 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 600 | 97.7%
Common | 14 | 2.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
w | 203 | 33.8%
d | 203 | 33.8%
f | 118 | 19.7%
r | 76 | 12.7%

Common
Value | Count | Frequency (%)
4 | 9 | 64.3%
(space) | 5 | 35.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 614 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
w | 203 | 33.1%
d | 203 | 33.1%
f | 118 | 19.2%
r | 76 | 12.4%
4 | 9 | 1.5%
(space) | 5 | 0.8%

engine_location
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
front | 200
rear | 3

Length

Max length: 5
Median length: 5
Mean length: 4.985221675
Min length: 4

Characters and Unicode

Total characters: 1012
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: front
2nd row: front
3rd row: front
4th row: front
5th row: front

Common Values

Value | Count | Frequency (%)
front | 200 | 98.5%
rear | 3 | 1.5%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
r | 206 | 20.4%
f | 200 | 19.8%
o | 200 | 19.8%
n | 200 | 19.8%
t | 200 | 19.8%
e | 3 | 0.3%
a | 3 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1012 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 206 | 20.4%
f | 200 | 19.8%
o | 200 | 19.8%
n | 200 | 19.8%
t | 200 | 19.8%
e | 3 | 0.3%
a | 3 | 0.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1012 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 206 | 20.4%
f | 200 | 19.8%
o | 200 | 19.8%
n | 200 | 19.8%
t | 200 | 19.8%
e | 3 | 0.3%
a | 3 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1012 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 206 | 20.4%
f | 200 | 19.8%
o | 200 | 19.8%
n | 200 | 19.8%
t | 200 | 19.8%
e | 3 | 0.3%
a | 3 | 0.3%

wheel_base
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 52
Distinct (%): 25.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.78275862
Minimum: 86.6
Maximum: 120.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 86.6
5-th percentile: 93.01
Q1: 94.5
median: 97
Q3: 102.4
95-th percentile: 110
Maximum: 120.9
Range: 34.3
Interquartile range (IQR): 7.9

Descriptive statistics

Standard deviation: 6.04567993
Coefficient of variation (CV): 0.0612017726
Kurtosis: 0.9746980226
Mean: 98.78275862
Median Absolute Deviation (MAD): 2.8
Skewness: 1.035913937
Sum: 20052.9
Variance: 36.55024582
Monotonicity: Not monotonic
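
Several of the wheel_base statistics are simple functions of the others. The sketch below re-derives them from the reported figures using the usual textbook definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). Those definitions are assumed here, not taken from the report's configuration.

```python
import math

# Figures copied from the wheel_base statistics.
minimum, maximum = 86.6, 120.9
q1, q3 = 94.5, 102.4
mean = 98.78275862
std = 6.04567993

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle half of the data
cv = std / mean                  # dispersion relative to the mean
variance = std ** 2              # squared standard deviation

print(f"range    = {value_range:.1f}")
print(f"IQR      = {iqr:.1f}")
print(f"CV       = {cv:.10f}")
print(f"variance = {variance:.8f}")

# The derived values agree with the report up to rounding of the inputs.
assert math.isclose(cv, 0.0612017726, rel_tol=1e-6)
assert math.isclose(variance, 36.55024582, rel_tol=1e-6)
```

The same relationships can be checked for the other numeric variables by substituting their figures.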
Histogram with fixed size bins (bins=50)
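Most frequent values
Value | Count | Frequency (%)
94.5 | 21 | 10.3%
93.7 | 20 | 9.9%
95.7 | 13 | 6.4%
96.5 | 8 | 3.9%
97.3 | 7 | 3.4%
98.4 | 7 | 3.4%
104.3 | 6 | 3.0%
100.4 | 6 | 3.0%
107.9 | 6 | 3.0%
98.8 | 6 | 3.0%
Other values (42) | 103 | 50.7%

Smallest values
Value | Count | Frequency (%)
86.6 | 2 | 1.0%
88.4 | 1 | 0.5%
88.6 | 2 | 1.0%
89.5 | 3 | 1.5%
91.3 | 2 | 1.0%
93 | 1 | 0.5%
93.1 | 5 | 2.5%
93.3 | 1 | 0.5%
93.7 | 20 | 9.9%
94.3 | 1 | 0.5%

Largest values
Value | Count | Frequency (%)
120.9 | 1 | 0.5%
115.6 | 2 | 1.0%
114.2 | 4 | 2.0%
113 | 2 | 1.0%
112 | 1 | 0.5%
110 | 3 | 1.5%
109.1 | 5 | 2.5%
108 | 1 | 0.5%
107.9 | 6 | 3.0%
106.7 | 1 | 0.5%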

length
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 74
Distinct (%): 36.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 173.9990148
Minimum: 141.1
Maximum: 208.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 141.1
5-th percentile: 157.12
Q1: 166.3
median: 173.2
Q3: 183.3
95-th percentile: 196.68
Maximum: 208.1
Range: 67
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 12.3855113
Coefficient of variation (CV): 0.07118150246
Kurtosis: -0.09749147682
Mean: 173.9990148
Median Absolute Deviation (MAD): 6.9
Skewness: 0.1668307949
Sum: 35321.8
Variance: 153.4008901
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
157.3 | 15 | 7.4%
188.8 | 11 | 5.4%
171.7 | 7 | 3.4%
186.7 | 7 | 3.4%
166.3 | 7 | 3.4%
165.3 | 6 | 3.0%
177.8 | 6 | 3.0%
176.2 | 6 | 3.0%
186.6 | 6 | 3.0%
173.2 | 5 | 2.5%
Other values (64) | 127 | 62.6%

Smallest values
Value | Count | Frequency (%)
141.1 | 1 | 0.5%
144.6 | 2 | 1.0%
150 | 3 | 1.5%
155.9 | 3 | 1.5%
156.9 | 1 | 0.5%
157.1 | 1 | 0.5%
157.3 | 15 | 7.4%
157.9 | 1 | 0.5%
158.7 | 3 | 1.5%
158.8 | 1 | 0.5%

Largest values
Value | Count | Frequency (%)
208.1 | 1 | 0.5%
202.6 | 2 | 1.0%
199.6 | 2 | 1.0%
199.2 | 1 | 0.5%
198.9 | 4 | 2.0%
197 | 1 | 0.5%
193.8 | 1 | 0.5%
192.7 | 3 | 1.5%
191.7 | 1 | 0.5%
190.9 | 2 | 1.0%

width
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 43
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.90147783
Minimum: 60.3
Maximum: 72.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 60.3
5-th percentile: 63.6
Q1: 64.05
median: 65.5
Q3: 66.9
95-th percentile: 70.48
Maximum: 72.3
Range: 12
Interquartile range (IQR): 2.85

Descriptive statistics

Standard deviation: 2.154835176
Coefficient of variation (CV): 0.03269782784
Kurtosis: 0.6840525768
Mean: 65.90147783
Median Absolute Deviation (MAD): 1.4
Skewness: 0.9094811915
Sum: 13378
Variance: 4.643314637
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=43)
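Most frequent values
Value | Count | Frequency (%)
63.8 | 24 | 11.8%
66.5 | 22 | 10.8%
65.4 | 15 | 7.4%
63.6 | 11 | 5.4%
68.4 | 10 | 4.9%
64.4 | 10 | 4.9%
64 | 9 | 4.4%
65.5 | 8 | 3.9%
65.2 | 7 | 3.4%
65.6 | 6 | 3.0%
Other values (33) | 81 | 39.9%

Smallest values
Value | Count | Frequency (%)
60.3 | 1 | 0.5%
61.8 | 1 | 0.5%
62.5 | 1 | 0.5%
63.4 | 1 | 0.5%
63.6 | 11 | 5.4%
63.8 | 24 | 11.8%
63.9 | 3 | 1.5%
64 | 9 | 4.4%
64.1 | 2 | 1.0%
64.2 | 6 | 3.0%

Largest values
Value | Count | Frequency (%)
72.3 | 1 | 0.5%
72 | 1 | 0.5%
71.7 | 3 | 1.5%
71.4 | 3 | 1.5%
70.9 | 1 | 0.5%
70.6 | 1 | 0.5%
70.5 | 1 | 0.5%
70.3 | 3 | 1.5%
69.6 | 2 | 1.0%
68.9 | 4 | 2.0%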

height
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 48
Distinct (%): 23.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.73349754
Minimum: 47.8
Maximum: 59.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 47.8
5-th percentile: 49.7
Q1: 52
median: 54.1
Q3: 55.5
95-th percentile: 57.5
Maximum: 59.8
Range: 12
Interquartile range (IQR): 3.5

Descriptive statistics

Standard deviation: 2.442864145
Coefficient of variation (CV): 0.0454625933
Kurtosis: -0.432070274
Mean: 53.73349754
Median Absolute Deviation (MAD): 1.6
Skewness: 0.06351632831
Sum: 10907.9
Variance: 5.967585231
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
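Most frequent values
Value | Count | Frequency (%)
50.8 | 14 | 6.9%
52 | 12 | 5.9%
55.7 | 12 | 5.9%
54.1 | 10 | 4.9%
54.5 | 10 | 4.9%
55.5 | 9 | 4.4%
56.7 | 8 | 3.9%
54.3 | 8 | 3.9%
52.6 | 7 | 3.4%
56.1 | 7 | 3.4%
Other values (38) | 106 | 52.2%

Smallest values
Value | Count | Frequency (%)
47.8 | 1 | 0.5%
48.8 | 2 | 1.0%
49.4 | 2 | 1.0%
49.6 | 4 | 2.0%
49.7 | 3 | 1.5%
50.2 | 6 | 3.0%
50.5 | 1 | 0.5%
50.6 | 5 | 2.5%
50.8 | 14 | 6.9%
51 | 1 | 0.5%

Largest values
Value | Count | Frequency (%)
59.8 | 2 | 1.0%
59.1 | 3 | 1.5%
58.7 | 4 | 2.0%
58.3 | 1 | 0.5%
57.5 | 3 | 1.5%
56.7 | 8 | 3.9%
56.5 | 2 | 1.0%
56.3 | 2 | 1.0%
56.2 | 3 | 1.5%
56.1 | 7 | 3.4%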

curb_weight
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 170
Distinct (%): 83.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2555.921182
Minimum: 1488
Maximum: 4066
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 1488
5-th percentile: 1900.5
Q1: 2145
median: 2414
Q3: 2943.5
95-th percentile: 3504
Maximum: 4066
Range: 2578
Interquartile range (IQR): 798.5

Descriptive statistics

Standard deviation: 523.2055554
Coefficient of variation (CV): 0.2047033214
Kurtosis: -0.07270255842
Mean: 2555.921182
Median Absolute Deviation (MAD): 390
Skewness: 0.6762665084
Sum: 518852
Variance: 273744.0532
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
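Most frequent values
Value | Count | Frequency (%)
2385 | 4 | 2.0%
1989 | 3 | 1.5%
1918 | 3 | 1.5%
2275 | 3 | 1.5%
2410 | 2 | 1.0%
2191 | 2 | 1.0%
2535 | 2 | 1.0%
2024 | 2 | 1.0%
2414 | 2 | 1.0%
4066 | 2 | 1.0%
Other values (160) | 178 | 87.7%

Smallest values
Value | Count | Frequency (%)
1488 | 1 | 0.5%
1713 | 1 | 0.5%
1819 | 1 | 0.5%
1837 | 1 | 0.5%
1874 | 2 | 1.0%
1876 | 2 | 1.0%
1889 | 1 | 0.5%
1890 | 1 | 0.5%
1900 | 1 | 0.5%
1905 | 1 | 0.5%

Largest values
Value | Count | Frequency (%)
4066 | 2 | 1.0%
3950 | 1 | 0.5%
3900 | 1 | 0.5%
3770 | 1 | 0.5%
3750 | 1 | 0.5%
3740 | 1 | 0.5%
3715 | 1 | 0.5%
3685 | 1 | 0.5%
3515 | 1 | 0.5%
3505 | 1 | 0.5%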

engine_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 7
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
ohc | 146
ohcf | 15
ohcv | 13
dohc | 12
l | 12
Other values (2) | 5

Length

Max length: 5
Median length: 3
Mean length: 3.128078818
Min length: 1

Characters and Unicode

Total characters: 635
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: dohc
2nd row: dohc
3rd row: ohcv
4th row: ohc
5th row: ohc

Common Values

Value | Count | Frequency (%)
ohc | 146 | 71.9%
ohcf | 15 | 7.4%
ohcv | 13 | 6.4%
dohc | 12 | 5.9%
l | 12 | 5.9%
rotor | 4 | 2.0%
dohcv | 1 | 0.5%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
o | 195 | 30.7%
h | 187 | 29.4%
c | 187 | 29.4%
f | 15 | 2.4%
v | 14 | 2.2%
d | 13 | 2.0%
l | 12 | 1.9%
r | 8 | 1.3%
t | 4 | 0.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 635 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 195 | 30.7%
h | 187 | 29.4%
c | 187 | 29.4%
f | 15 | 2.4%
v | 14 | 2.2%
d | 13 | 2.0%
l | 12 | 1.9%
r | 8 | 1.3%
t | 4 | 0.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 635 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 195 | 30.7%
h | 187 | 29.4%
c | 187 | 29.4%
f | 15 | 2.4%
v | 14 | 2.2%
d | 13 | 2.0%
l | 12 | 1.9%
r | 8 | 1.3%
t | 4 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 635 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 195 | 30.7%
h | 187 | 29.4%
c | 187 | 29.4%
f | 15 | 2.4%
v | 14 | 2.2%
d | 13 | 2.0%
l | 12 | 1.9%
r | 8 | 1.3%
t | 4 | 0.6%

num_of_cylinders
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 8
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
four | 157
six | 24
five | 11
eight | 5
two | 3
Other values (3) | 3

Length

Max length: 6
Median length: 4
Mean length: 3.901477833
Min length: 3

Characters and Unicode

Total characters: 792
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 1.5%

Sample

1st row: four
2nd row: four
3rd row: six
4th row: four
5th row: five

Common Values

Value | Count | Frequency (%)
four | 157 | 77.3%
six | 24 | 11.8%
five | 11 | 5.4%
eight | 5 | 2.5%
two | 3 | 1.5%
three | 1 | 0.5%
twelve | 1 | 0.5%
tow | 1 | 0.5%
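
The value "tow" is not an English number word and may be a misspelling of "two", though the report itself does not say so. A minimal sketch of how unexpected labels could be flagged before converting the column to integers. The `WORD_TO_INT` mapping is a hypothetical helper covering only the number words seen in this table.

```python
# Hypothetical lookup for the number words observed in this column.
WORD_TO_INT = {
    "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "eight": 8, "twelve": 12,
}

# Counts copied from the "Common Values" table.
observed = {
    "four": 157, "six": 24, "five": 11, "eight": 5,
    "two": 3, "three": 1, "twelve": 1, "tow": 1,
}

# Labels that cannot be mapped need a manual decision, not a silent fix.
unknown = sorted(label for label in observed if label not in WORD_TO_INT)
total_rows = sum(observed.values())

print("unrecognised labels:", unknown)
print("rows covered by the table:", total_rows)
```

Whether an unrecognised label is corrected, dropped, or kept is a judgment call that depends on the data source.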

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
f | 168 | 21.2%
o | 161 | 20.3%
r | 158 | 19.9%
u | 157 | 19.8%
i | 40 | 5.1%
s | 24 | 3.0%
x | 24 | 3.0%
e | 20 | 2.5%
v | 12 | 1.5%
t | 11 | 1.4%
Other values (4) | 17 | 2.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 792 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
f | 168 | 21.2%
o | 161 | 20.3%
r | 158 | 19.9%
u | 157 | 19.8%
i | 40 | 5.1%
s | 24 | 3.0%
x | 24 | 3.0%
e | 20 | 2.5%
v | 12 | 1.5%
t | 11 | 1.4%
Other values (4) | 17 | 2.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 792 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
f | 168 | 21.2%
o | 161 | 20.3%
r | 158 | 19.9%
u | 157 | 19.8%
i | 40 | 5.1%
s | 24 | 3.0%
x | 24 | 3.0%
e | 20 | 2.5%
v | 12 | 1.5%
t | 11 | 1.4%
Other values (4) | 17 | 2.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 792 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
f | 168 | 21.2%
o | 161 | 20.3%
r | 158 | 19.9%
u | 157 | 19.8%
i | 40 | 5.1%
s | 24 | 3.0%
x | 24 | 3.0%
e | 20 | 2.5%
v | 12 | 1.5%
t | 11 | 1.4%
Other values (4) | 17 | 2.1%

engine_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 43
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.8571429
Minimum: 61
Maximum: 326
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 61
5-th percentile: 90
Q1: 97
median: 119
Q3: 143
95-th percentile: 202.1
Maximum: 326
Range: 265
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 41.84523922
Coefficient of variation (CV): 0.32986112
Kurtosis: 5.237718427
Mean: 126.8571429
Median Absolute Deviation (MAD): 22
Skewness: 1.942317823
Sum: 25752
Variance: 1751.024045
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=43)
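Most frequent values
Value | Count | Frequency (%)
122 | 15 | 7.4%
92 | 15 | 7.4%
98 | 14 | 6.9%
97 | 14 | 6.9%
108 | 13 | 6.4%
110 | 12 | 5.9%
90 | 12 | 5.9%
109 | 8 | 3.9%
141 | 7 | 3.4%
120 | 7 | 3.4%
Other values (33) | 86 | 42.4%

Smallest values
Value | Count | Frequency (%)
61 | 1 | 0.5%
70 | 3 | 1.5%
79 | 1 | 0.5%
80 | 1 | 0.5%
90 | 12 | 5.9%
91 | 5 | 2.5%
92 | 15 | 7.4%
97 | 14 | 6.9%
98 | 14 | 6.9%
103 | 1 | 0.5%

Largest values
Value | Count | Frequency (%)
326 | 1 | 0.5%
308 | 1 | 0.5%
304 | 1 | 0.5%
258 | 2 | 1.0%
234 | 2 | 1.0%
209 | 3 | 1.5%
203 | 1 | 0.5%
194 | 3 | 1.5%
183 | 4 | 2.0%
181 | 6 | 3.0%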

fuel_system
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 8
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
mpfi | 92
2bbl | 66
idi | 20
1bbl | 11
spdi | 9
Other values (3) | 5

Length

Max length: 4
Median length: 4
Mean length: 3.896551724
Min length: 3

Characters and Unicode

Total characters: 791
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 1.0%

Sample

1st row: mpfi
2nd row: mpfi
3rd row: mpfi
4th row: mpfi
5th row: mpfi

Common Values

Value | Count | Frequency (%)
mpfi | 92 | 45.3%
2bbl | 66 | 32.5%
idi | 20 | 9.9%
1bbl | 11 | 5.4%
spdi | 9 | 4.4%
4bbl | 3 | 1.5%
mfi | 1 | 0.5%
spfi | 1 | 0.5%

Length

Histogram of lengths of the category

Category Frequency Plot


Most occurring characters

Value | Count | Frequency (%)
b | 160 | 20.2%
i | 143 | 18.1%
p | 102 | 12.9%
f | 94 | 11.9%
m | 93 | 11.8%
l | 80 | 10.1%
2 | 66 | 8.3%
d | 29 | 3.7%
1 | 11 | 1.4%
s | 10 | 1.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 711 | 89.9%
Decimal Number | 80 | 10.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
b | 160 | 22.5%
i | 143 | 20.1%
p | 102 | 14.3%
f | 94 | 13.2%
m | 93 | 13.1%
l | 80 | 11.3%
d | 29 | 4.1%
s | 10 | 1.4%

Decimal Number
Value | Count | Frequency (%)
2 | 66 | 82.5%
1 | 11 | 13.8%
4 | 3 | 3.8%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 711 | 89.9%
Common | 80 | 10.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
b | 160 | 22.5%
i | 143 | 20.1%
p | 102 | 14.3%
f | 94 | 13.2%
m | 93 | 13.1%
l | 80 | 11.3%
d | 29 | 4.1%
s | 10 | 1.4%

Common
Value | Count | Frequency (%)
2 | 66 | 82.5%
1 | 11 | 13.8%
4 | 3 | 3.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 791 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
b | 160 | 20.2%
i | 143 | 18.1%
p | 102 | 12.9%
f | 94 | 11.9%
m | 93 | 11.8%
l | 80 | 10.1%
2 | 66 | 8.3%
d | 29 | 3.7%
1 | 11 | 1.4%
s | 10 | 1.3%

compression_ratio
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 33
Distinct (%): 16.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.45231527
Minimum: 7
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 7
5-th percentile: 7.5
Q1: 8.55
median: 9
Q3: 9.4
95-th percentile: 21.99
Maximum: 70
Range: 63
Interquartile range (IQR): 0.85

Descriptive statistics

Standard deviation: 5.792527882
Coefficient of variation (CV): 0.5541861044
Kurtosis: 55.58206486
Mean: 10.45231527
Median Absolute Deviation (MAD): 0.4
Skewness: 6.20515692
Sum: 2121.82
Variance: 33.55337927
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
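Most frequent values
Value | Count | Frequency (%)
9 | 46 | 22.7%
9.4 | 26 | 12.8%
8.5 | 14 | 6.9%
9.5 | 13 | 6.4%
9.3 | 11 | 5.4%
8 | 8 | 3.9%
9.2 | 8 | 3.9%
8.7 | 7 | 3.4%
7 | 7 | 3.4%
8.6 | 5 | 2.5%
Other values (23) | 58 | 28.6%

Smallest values
Value | Count | Frequency (%)
7 | 7 | 3.4%
7.5 | 5 | 2.5%
7.6 | 4 | 2.0%
7.7 | 2 | 1.0%
7.8 | 1 | 0.5%
8 | 8 | 3.9%
8.1 | 2 | 1.0%
8.3 | 3 | 1.5%
8.4 | 5 | 2.5%
8.5 | 14 | 6.9%

Largest values
Value | Count | Frequency (%)
70 | 1 | 0.5%
23 | 5 | 2.5%
22.7 | 1 | 0.5%
22.5 | 3 | 1.5%
22 | 1 | 0.5%
21.9 | 1 | 0.5%
21.5 | 4 | 2.0%
21 | 5 | 2.5%
11.5 | 1 | 0.5%
10.1 | 1 | 0.5%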

horsepower
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 59
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.2561576
Minimum: 48
Maximum: 288
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 48
5-th percentile: 62
Q1: 70
Median: 95
Q3: 116
95-th percentile: 181.4
Maximum: 288
Range: 240
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 39.71436879
Coefficient of variation (CV): 0.3809306777
Kurtosis: 2.623279794
Mean: 104.2561576
Median Absolute Deviation (MAD): 25
Skewness: 1.391029494
Sum: 21164
Variance: 1577.231088
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
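Common values

Value | Count | Frequency (%)
68 | 19 | 9.4%
70 | 11 | 5.4%
69 | 10 | 4.9%
116 | 9 | 4.4%
110 | 8 | 3.9%
95 | 7 | 3.4%
114 | 6 | 3.0%
160 | 6 | 3.0%
101 | 6 | 3.0%
62 | 6 | 3.0%
Other values (49) | 115 | 56.7%

Smallest values

Value | Count | Frequency (%)
48 | 1 | 0.5%
52 | 2 | 1.0%
55 | 1 | 0.5%
56 | 2 | 1.0%
58 | 1 | 0.5%
60 | 1 | 0.5%
62 | 6 | 3.0%
64 | 1 | 0.5%
68 | 19 | 9.4%
69 | 10 | 4.9%

Largest values

Value | Count | Frequency (%)
288 | 1 | 0.5%
262 | 1 | 0.5%
207 | 3 | 1.5%
200 | 1 | 0.5%
184 | 2 | 1.0%
182 | 3 | 1.5%
176 | 2 | 1.0%
175 | 1 | 0.5%
162 | 2 | 1.0%
161 | 2 | 1.0%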

city_mpg
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 29
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.24137931
Minimum: 13
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 13
5-th percentile: 16
Q1: 19
Median: 24
Q3: 30
95-th percentile: 37
Maximum: 49
Range: 36
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.570701702
Coefficient of variation (CV): 0.2603146849
Kurtosis: 0.5428638789
Mean: 25.24137931
Median Absolute Deviation (MAD): 5
Skewness: 0.6519394127
Sum: 5124
Variance: 43.17412086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
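Common values

Value | Count | Frequency (%)
31 | 28 | 13.8%
19 | 27 | 13.3%
24 | 22 | 10.8%
27 | 14 | 6.9%
17 | 13 | 6.4%
26 | 12 | 5.9%
23 | 10 | 4.9%
21 | 8 | 3.9%
25 | 8 | 3.9%
30 | 8 | 3.9%
Other values (19) | 53 | 26.1%

Smallest values

Value | Count | Frequency (%)
13 | 1 | 0.5%
14 | 2 | 1.0%
15 | 3 | 1.5%
16 | 6 | 3.0%
17 | 13 | 6.4%
18 | 3 | 1.5%
19 | 27 | 13.3%
20 | 3 | 1.5%
21 | 8 | 3.9%
22 | 4 | 2.0%

Largest values

Value | Count | Frequency (%)
49 | 1 | 0.5%
47 | 1 | 0.5%
45 | 1 | 0.5%
38 | 7 | 3.4%
37 | 6 | 3.0%
36 | 1 | 0.5%
35 | 1 | 0.5%
34 | 1 | 0.5%
33 | 1 | 0.5%
32 | 1 | 0.5%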

highway_mpg
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.74876847
Minimum: 16
Maximum: 54
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 16
5-th percentile: 22
Q1: 25
Median: 30
Q3: 35
95-th percentile: 42.9
Maximum: 54
Range: 38
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.920405754
Coefficient of variation (CV): 0.2250628594
Kurtosis: 0.4073312252
Mean: 30.74876847
Median Absolute Deviation (MAD): 5
Skewness: 0.5384788366
Sum: 6242
Variance: 47.8920158
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
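Common values

Value | Count | Frequency (%)
25 | 19 | 9.4%
38 | 17 | 8.4%
24 | 17 | 8.4%
30 | 16 | 7.9%
32 | 16 | 7.9%
34 | 14 | 6.9%
37 | 13 | 6.4%
28 | 13 | 6.4%
29 | 10 | 4.9%
33 | 9 | 4.4%
Other values (20) | 59 | 29.1%

Smallest values

Value | Count | Frequency (%)
16 | 2 | 1.0%
17 | 1 | 0.5%
18 | 2 | 1.0%
19 | 2 | 1.0%
20 | 2 | 1.0%
22 | 8 | 3.9%
23 | 7 | 3.4%
24 | 17 | 8.4%
25 | 19 | 9.4%
26 | 3 | 1.5%

Largest values

Value | Count | Frequency (%)
54 | 1 | 0.5%
53 | 1 | 0.5%
50 | 1 | 0.5%
47 | 2 | 1.0%
46 | 2 | 1.0%
43 | 4 | 2.0%
42 | 3 | 1.5%
41 | 3 | 1.5%
39 | 2 | 1.0%
38 | 17 | 8.4%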

price
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 185
Distinct (%): 91.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12982.47783
Minimum: 0
Maximum: 45400
Zeros: 4
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5506.3
Q1: 7649
Median: 10245
Q3: 16500
95-th percentile: 32500.2
Maximum: 45400
Range: 45400
Interquartile range (IQR): 8851

Descriptive statistics

Standard deviation: 8111.953571
Coefficient of variation (CV): 0.6248386229
Kurtosis: 3.006376555
Mean: 12982.47783
Median Absolute Deviation (MAD): 3390
Skewness: 1.683058301
Sum: 2635443
Variance: 65803790.74
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
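Common values

Value | Count | Frequency (%)
0 | 4 | 2.0%
8921 | 2 | 1.0%
18150 | 2 | 1.0%
7898 | 2 | 1.0%
7775 | 2 | 1.0%
8845 | 2 | 1.0%
7295 | 2 | 1.0%
7609 | 2 | 1.0%
6692 | 2 | 1.0%
6229 | 2 | 1.0%
Other values (175) | 181 | 89.2%

Smallest values

Value | Count | Frequency (%)
0 | 4 | 2.0%
5118 | 1 | 0.5%
5151 | 1 | 0.5%
5195 | 1 | 0.5%
5348 | 1 | 0.5%
5389 | 1 | 0.5%
5399 | 1 | 0.5%
5499 | 1 | 0.5%
5572 | 2 | 1.0%
6095 | 1 | 0.5%

Largest values

Value | Count | Frequency (%)
45400 | 1 | 0.5%
41315 | 1 | 0.5%
40960 | 1 | 0.5%
37028 | 1 | 0.5%
36880 | 1 | 0.5%
36000 | 1 | 0.5%
35550 | 1 | 0.5%
35056 | 1 | 0.5%
34184 | 1 | 0.5%
34028 | 1 | 0.5%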

Interactions

[Figures: pairwise interaction scatter plots of the numeric variables (121 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
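
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. This is an illustrative sketch, not the report generator's own code; the helper names and the sample data are assumptions, and ties receive the average of their rank positions.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    std_x = covariance(rank_x, rank_x) ** 0.5
    std_y = covariance(rank_y, rank_y) ** 0.5
    return covariance(rank_x, rank_y) / (std_x * std_y)


# Hypothetical data: y grows monotonically, but not linearly, with x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(round(rho, 6))
```

Because only the ordering matters, the cubic relationship in the sample data still yields a ρ of 1 up to floating-point rounding.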

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
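
A minimal sketch of this calculation, which also checks the invariance under location and positive scale changes. The five horsepower and city_mpg values come from the first sample rows of this report; the function name, the offset, and the scale factors are illustrative assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# horsepower and city_mpg from sample rows 0 to 4 of this report.
horsepower = [111, 111, 154, 102, 115]
city_mpg = [21, 21, 19, 24, 18]

r = pearson_r(horsepower, city_mpg)

# Rescale and shift each variable separately, using positive scale factors:
# 0.7457 converts horsepower to kilowatts; the mpg transform is arbitrary.
rescaled_hp = [0.7457 * h for h in horsepower]
shifted_mpg = [2.0 * m + 10.0 for m in city_mpg]
r_transformed = pearson_r(rescaled_hp, shifted_mpg)

print(round(r, 4), round(r_transformed, 4))
```

Both calls give the same coefficient. A negative scale factor on one variable would flip the sign of r while leaving its magnitude unchanged.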

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
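
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with a hypothetical function name and sample data; implementations that adjust for ties (tau-b, for example) can give different values when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, i.e. tau-a."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 is a tie: counted as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


# Hypothetical observations: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.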

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
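
As an illustration, the sketch below computes Cramér's V with the bias correction commonly attributed to Bergsma (2013): the chi-squared-based φ² and the table dimensions are each adjusted by terms in 1/(n - 1). The function name and sample data are hypothetical, and the report generator's own implementation may differ in detail.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table (no continuity correction).
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            chi2 += (joint[(row, col)] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias-corrected phi squared and table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denominator) if denominator > 0 else 0.0


# Hypothetical columns in which each fuel type always has the same fuel system.
fuel_type = ["gas", "gas", "diesel", "diesel"] * 5
fuel_system = ["mpfi", "mpfi", "idi", "idi"] * 5
v = cramers_v_corrected(fuel_type, fuel_system)
print(round(v, 6))
```

The sample columns are perfectly associated, so V comes out at 1 up to rounding; two columns whose categories co-occur exactly as often as independence predicts give 0.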

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

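index | make | fuel_type | num_of_doors | body_style | drive_wheels | engine_location | wheel_base | length | width | height | curb_weight | engine_type | num_of_cylinders | engine_size | fuel_system | compression_ratio | horsepower | city_mpg | highway_mpg | price
0 | alfa-romero | gas | two | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | dohc | four | 130 | mpfi | 9.0 | 111 | 21 | 27 | 13495
1 | alfa-romero | gas | two | convertible | rwd | front | 88.6 | 168.8 | 64.1 | 48.8 | 2548 | dohc | four | 130 | mpfi | 9.0 | 111 | 21 | 27 | 16500
2 | alfa-romero | gas | two | hatchback | rwd | front | 94.5 | 171.2 | 65.5 | 52.4 | 2823 | ohcv | six | 152 | mpfi | 9.0 | 154 | 19 | 26 | 16500
3 | audi | gas | four | sedan | fwd | front | 99.8 | 176.6 | 66.2 | 54.3 | 2337 | ohc | four | 109 | mpfi | 70.0 | 102 | 24 | 30 | 13950
4 | audi | gas | four | sedan | 4wd | front | 99.4 | 176.6 | 66.4 | 54.3 | 2824 | ohc | five | 136 | mpfi | 8.0 | 115 | 18 | 22 | 17450
5 | audi | gas | two | sedan | fwd | front | 99.8 | 177.3 | 66.3 | 53.1 | 2507 | ohc | five | 136 | mpfi | 8.5 | 110 | 19 | 25 | 15250
6 | audi | gas | four | sedan | fwd | front | 105.8 | 192.7 | 71.4 | 55.7 | 2844 | ohc | five | 136 | mpfi | 8.5 | 110 | 19 | 25 | 17710
7 | audi | gas | four | wagon | fwd | front | 105.8 | 192.7 | 71.4 | 55.7 | 2954 | ohc | five | 136 | mpfi | 8.5 | 110 | 19 | 25 | 18920
8 | audi | gas | four | sedan | fwd | front | 105.8 | 192.7 | 71.4 | 55.9 | 3086 | ohc | five | 131 | mpfi | 8.3 | 140 | 17 | 20 | 23875
9 | audi | gas | two | hatchback | 4wd | front | 99.5 | 178.2 | 67.9 | 52.0 | 3053 | ohc | five | 131 | mpfi | 7.0 | 160 | 16 | 22 | 0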

Last rows

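index | make | fuel_type | num_of_doors | body_style | drive_wheels | engine_location | wheel_base | length | width | height | curb_weight | engine_type | num_of_cylinders | engine_size | fuel_system | compression_ratio | horsepower | city_mpg | highway_mpg | price
193 | volvo | gas | four | wagon | rwd | front | 104.3 | 188.8 | 67.2 | 57.5 | 3034 | ohc | four | 141 | mpfi | 9.5 | 114 | 23 | 28 | 13415
194 | volvo | gas | four | sedan | rwd | front | 104.3 | 188.8 | 67.2 | 56.2 | 2935 | ohc | four | 141 | mpfi | 9.5 | 114 | 24 | 28 | 15985
195 | volvo | gas | four | wagon | rwd | front | 104.3 | 188.8 | 67.2 | 57.5 | 3042 | ohc | four | 141 | mpfi | 9.5 | 114 | 24 | 28 | 16515
196 | volvo | gas | four | sedan | rwd | front | 104.3 | 188.8 | 67.2 | 56.2 | 3045 | ohc | four | 130 | mpfi | 7.5 | 162 | 17 | 22 | 18420
197 | volvo | gas | four | wagon | rwd | front | 104.3 | 188.8 | 67.2 | 57.5 | 3157 | ohc | four | 130 | mpfi | 7.5 | 162 | 17 | 22 | 18950
198 | volvo | gas | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 2952 | ohc | four | 141 | mpfi | 9.5 | 114 | 23 | 28 | 16845
199 | volvo | gas | four | sedan | rwd | front | 109.1 | 188.8 | 68.8 | 55.5 | 3049 | ohc | four | 141 | mpfi | 8.7 | 160 | 19 | 25 | 19045
200 | volvo | gas | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 3012 | ohcv | six | 173 | mpfi | 8.8 | 134 | 18 | 23 | 21485
201 | volvo | diesel | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 3217 | ohc | six | 145 | idi | 23.0 | 106 | 26 | 27 | 22470
202 | volvo | gas | four | sedan | rwd | front | 109.1 | 188.8 | 68.9 | 55.5 | 3062 | ohc | four | 141 | mpfi | 9.5 | 114 | 19 | 25 | 22625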

Duplicate rows

Most frequently occurring

index | make | fuel_type | num_of_doors | body_style | drive_wheels | engine_location | wheel_base | length | width | height | curb_weight | engine_type | num_of_cylinders | engine_size | fuel_system | compression_ratio | horsepower | city_mpg | highway_mpg | price | # duplicates
0 | mitsubishi | gas | four | sedan | fwd | front | 96.3 | 172.4 | 65.4 | 51.6 | 2403 | ohc | four | 110 | spdi | 7.5 | 116 | 23 | 30 | 9279 | 2